Abstract
Network intrusion detection plays a vital role in protecting modern computer networks and IoT environments from cyber threats. Traditional machine learning approaches often focus on improving detection accuracy without providing insight into the decision-making process. This paper proposes an explainable intrusion detection framework using a Random Forest classifier trained on the CICIDS2017 dataset, which includes realistic benign and malicious network traffic and so enables effective model evaluation. The proposed model achieves 92% accuracy in distinguishing benign from attack traffic. To enhance transparency and interpretability, SHAP (SHapley Additive exPlanations) is employed to analyze the feature contributions that influence predictions. Experimental results demonstrate that flow-based and packet-level features significantly impact attack detection. Integrating explainable AI improves the trust and usability of machine-learning-based intrusion detection systems in real-world cyber security applications.
Introduction
This paper presents an explainable machine learning–based approach for network intrusion detection in response to the growing sophistication of cyber attacks in internet-connected and IoT environments. Traditional signature-based intrusion detection systems are limited in their ability to detect novel or evolving threats, while many machine learning–based IDS models lack interpretability, reducing trust and hindering adoption in critical cyber security settings.
To address this issue, the study proposes an intrusion detection framework using a Random Forest classifier combined with SHAP (SHapley Additive exPlanations) to provide feature-level interpretability. The CICIDS2017 dataset, which reflects realistic modern network traffic and attack scenarios, is used for evaluation. The dataset contains flow-based representations of network traffic with 78 statistical features and was generated in a controlled environment over five days, covering both benign and malicious activities.
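The label and feature cleanup implied by this kind of flow-based dataset can be sketched as follows. This is a minimal, hedged example: the column names (`Flow Duration`, `Total Fwd Packets`, `Label`) follow the dataset's CSV exports, but the tiny DataFrame below is a synthetic stand-in for real CICIDS2017 flows, and `preprocess_flows` is an illustrative helper, not the paper's implementation.

```python
# Sketch of CICIDS2017-style preprocessing, assuming the flow records have
# been loaded into a pandas DataFrame. The data below is synthetic.
import numpy as np
import pandas as pd

def preprocess_flows(df: pd.DataFrame) -> tuple[pd.DataFrame, pd.Series]:
    """Clean flow records and binarize labels (BENIGN -> 0, attack -> 1)."""
    df = df.replace([np.inf, -np.inf], np.nan).dropna()  # drop malformed flows
    y = (df["Label"] != "BENIGN").astype(int)            # binary target
    X = df.drop(columns=["Label"])  # 78 statistical features in the real dataset
    return X, y

# Synthetic stand-in for a handful of CICIDS2017 flow records.
flows = pd.DataFrame({
    "Flow Duration": [120.0, 4.5, np.inf, 88.0],
    "Total Fwd Packets": [10, 2, 5, 7],
    "Label": ["BENIGN", "DDoS", "BENIGN", "PortScan"],
})
X, y = preprocess_flows(flows)  # the inf-valued flow is dropped
```

Binarizing the label column collapses the dataset's many attack categories into a single benign-vs-attack task, matching the two-class evaluation reported here.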
The methodology includes systematic data preprocessing, feature scaling, and stratified data splitting, followed by training a Random Forest model with controlled depth to balance accuracy and efficiency. Experimental results demonstrate strong classification performance using standard evaluation metrics, while SHAP analysis offers transparent insights into how individual features influence intrusion detection decisions.
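The pipeline described above can be sketched in scikit-learn. This is a hedged illustration: the synthetic data from `make_classification` stands in for the CICIDS2017 features, and the specific hyperparameters (100 trees, `max_depth=10`, 80/20 split) are assumptions for the sketch, not the paper's reported settings.

```python
# Sketch of the training pipeline: scaling, stratified splitting, and a
# depth-limited Random Forest, followed by SHAP attributions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for benign/attack flow features.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           weights=[0.8, 0.2], random_state=42)

# Stratified split preserves the benign/attack class ratio in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

scaler = StandardScaler().fit(X_train)  # fit on training data only
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Controlled depth trades a little accuracy for smaller, faster trees.
clf = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42)
clf.fit(X_train_s, y_train)
accuracy = clf.score(X_test_s, y_test)

# Per-feature SHAP attributions; the import is guarded so the pipeline
# sketch still runs where the shap package is not installed.
try:
    import shap
    shap_values = shap.TreeExplainer(clf).shap_values(X_test_s)
except ImportError:
    shap_values = None
```

Fitting the scaler only on the training split avoids leaking test-set statistics into training, which matters when reporting the evaluation metrics used here.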
Overall, the work highlights the importance of integrating explainable AI techniques with effective machine learning models to improve trust, transparency, and practical usability of intrusion detection systems in real-world cyber security applications.
Conclusion
This study presented an explainable intrusion detection framework using a Random Forest classifier trained on the CICIDS2017 dataset. The model achieved 92% accuracy in distinguishing malicious from benign traffic, and SHAP-based explanations enhanced transparency by revealing the feature contributions that influence predictions. The results confirm that meaningful network flow features drive the model's decisions, validating both the effectiveness and the interpretability of the proposed approach and supporting its use in practical cyber security and IoT network monitoring applications.
References
[1] Canadian Institute for Cybersecurity, “CICIDS2017 Dataset,” University of New Brunswick, 2017.
[2] L. Breiman, “Random Forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[3] S. M. Lundberg and S. I. Lee, “A Unified Approach to Interpreting Model Predictions,” Advances in Neural Information Processing Systems (NeurIPS), 2017.
[4] R. Sommer and V. Paxson, “Outside the Closed World: On Using Machine Learning for Network Intrusion Detection,” IEEE Symposium on Security and Privacy, 2010.
[5] I. Sharafaldin, A. H. Lashkari, and A. A. Ghorbani, “Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization,” ICISSP, 2018.